Princeton has released SWE-agent, a software engineering agent that uses LLMs to fix bugs and issues in real GitHub repositories.
Meta's video calling products rely on bandwidth estimation (BWE) and congestion control for optimal performance. The hand-tuned system was complex and difficult to maintain, so Meta's team developed an ML approach to network characterization and optimization, replacing hand-tuned rules to improve efficiency and reliability. The ML system analyzes network signals to classify the network type, then applies settings for BWE and network resiliency optimized for that type.
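A minimal sketch of the classify-then-configure idea: the feature names, network classes, and tuned profiles below are invented for illustration, not Meta's actual system.

```python
# Illustrative only: features, network classes, and configs are hypothetical.
from dataclasses import dataclass
import numpy as np
from sklearn.ensemble import RandomForestClassifier

@dataclass
class BweConfig:
    initial_bitrate_kbps: int
    fec_enabled: bool  # forward error correction for resiliency

# One tuned profile per predicted network type.
PROFILES = {
    "wifi_stable":    BweConfig(initial_bitrate_kbps=2500, fec_enabled=False),
    "cellular_lossy": BweConfig(initial_bitrate_kbps=800,  fec_enabled=True),
    "low_bandwidth":  BweConfig(initial_bitrate_kbps=300,  fec_enabled=True),
}

# Train a classifier on observed network signals.
# X: rows of [rtt_ms, packet_loss_pct, jitter_ms]; y: labeled network types.
X_train = np.array([[20, 0.1, 2], [90, 3.0, 15], [200, 1.0, 40]] * 50)
y_train = ["wifi_stable", "cellular_lossy", "low_bandwidth"] * 50
clf = RandomForestClassifier(n_estimators=50).fit(X_train, y_train)

def configure_call(signals: np.ndarray) -> BweConfig:
    """Classify the network from live signals, then apply the tuned profile."""
    network_type = clf.predict(signals.reshape(1, -1))[0]
    return PROFILES[network_type]

print(configure_call(np.array([95, 2.5, 12])))  # -> cellular_lossy profile
```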
This developer built a meme search engine as a way to learn about vector embeddings and image encoding. They used OpenAI's CLIP model to encode images into vector embeddings, numerical vector representations of images or text that can be compared for similarity. The embeddings were stored in a vector database, which made the memes searchable with plain natural language.
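A minimal sketch of the approach: embed memes and text queries with CLIP and rank by cosine similarity. The `memes/` directory is a placeholder, and an in-memory array stands in for the vector database a real build would use.

```python
# Sketch: CLIP maps images and text into one embedding space, so a text query
# can be compared directly against image vectors.
from pathlib import Path
from PIL import Image
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("clip-ViT-B-32")

meme_paths = sorted(Path("memes/").glob("*.jpg"))  # placeholder image folder
images = [Image.open(p) for p in meme_paths]
image_vecs = model.encode(images, normalize_embeddings=True)  # (N, 512)

def search(query: str, k: int = 5):
    """Embed the natural-language query and return the k most similar memes."""
    query_vec = model.encode([query], normalize_embeddings=True)[0]
    scores = image_vecs @ query_vec  # cosine similarity (unit-norm vectors)
    top = np.argsort(-scores)[:k]
    return [(meme_paths[i].name, float(scores[i])) for i in top]

print(search("surprised cat"))
```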
Mouse clicks, scrolls, and movements leave a data trail that reveals user behavior. Machine learning can analyze this data to predict what users will do next or who they are. This can be used to personalize experiences and improve security, but factors like the type of mouse can affect accuracy.
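A toy sketch of how such a model might work: summarize a session's mouse events into simple statistics and train a classifier to identify the user. The features and data here are invented for illustration.

```python
# Toy sketch: mouse-dynamics features -> user identification.
import numpy as np
from sklearn.linear_model import LogisticRegression

def session_features(events: np.ndarray) -> np.ndarray:
    """events: rows of (timestamp_s, x, y). Returns speed/pause statistics."""
    dt = np.diff(events[:, 0])
    dist = np.linalg.norm(np.diff(events[:, 1:3], axis=0), axis=1)
    speed = dist / np.maximum(dt, 1e-6)
    return np.array([speed.mean(), speed.std(), dt.mean(), dt.max()])

# Fake training data: 40 sessions of cumulative (t, x, y) events, labeled by user.
rng = np.random.default_rng(0)
sessions = [rng.random((100, 3)).cumsum(axis=0) for _ in range(40)]
X = np.stack([session_features(s) for s in sessions])
y = np.array([0, 1] * 20)  # two users

clf = LogisticRegression().fit(X, y)
print(clf.predict(X[:3]))  # predicted user for the first three sessions
```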
Games that test your understanding of neural networks - choose a neural network and then try to assemble it.
This is a comprehensive collection of ideas that helps developers work with LLMs better in production. For example, RAG (Retrieval-Augmented Generation) is great at improving LLM performance and is preferred over fine-tuning for adding new knowledge to a model's context. There are tips on prompting models more effectively, such as using JSON or XML to structure inputs and outputs, as well as guidelines on properly evaluating and monitoring LLM inputs and outputs wherever LLMs sit in a production pipeline.
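A small illustration of the structured-prompting tip: wrap each part of the input in XML tags so the model can reliably tell instructions, context, and question apart. The tag names here are an arbitrary convention, not an API.

```python
# Sketch: XML-structured input, JSON-structured output request.
def build_prompt(context: str, question: str) -> str:
    return (
        "<instructions>\n"
        "Answer using only the provided context. Reply in JSON as\n"
        '{"answer": "...", "quote": "..."}.\n'
        "</instructions>\n"
        f"<context>\n{context}\n</context>\n"
        f"<question>\n{question}\n</question>"
    )

# In a RAG pipeline, `context` would be the retrieved documents.
print(build_prompt("The Eiffel Tower is 330 m tall.", "How tall is the Eiffel Tower?"))
```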
Fantastic diffusion paper that applies diffusion to code that renders images. Because the model operates on the program itself, it can make edits directly as part of the diffusion process. It is slow, but combines easily with search to dramatically improve reasoning ability.
Researchers have developed a Synthetic-Domain Alignment (SDA) framework to enhance test-time adaptation (TTA) methods. SDA effectively aligns source and synthetic domains by fine-tuning pretrained models with synthetic data generated through a conditional diffusion model.
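A schematic sketch of the SDA idea, not the paper's code: synthesize labeled images with a conditional diffusion model and fine-tune the pretrained source classifier on them, so the source model and the synthetic domain align. The `diffusion.sample` call is a stand-in for whatever sampler interface the paper uses.

```python
# Schematic sketch of synthetic-domain alignment via fine-tuning.
import torch
import torch.nn.functional as F

def align_to_synthetic_domain(classifier, diffusion, num_classes, steps=1000):
    opt = torch.optim.AdamW(classifier.parameters(), lr=1e-5)
    for _ in range(steps):
        # Sample class labels, then synthesize images conditioned on them.
        labels = torch.randint(0, num_classes, (32,))
        synthetic = diffusion.sample(labels)  # assumed sampler API
        loss = F.cross_entropy(classifier(synthetic), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    # At test time, TTA maps inputs into this same synthetic domain.
    return classifier
```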
A collection of free ML code challenges.
Under the same number of parameters or FLOPs, the authors find that KAN outperforms MLP only in symbolic formula representation, while remaining inferior to MLP on other machine learning, computer vision, NLP, and audio processing tasks.
Time-MoE is a GitHub project focused on billion-scale time series foundation models built on a mixture-of-experts architecture. The models forecast auto-regressively, accommodating flexible prediction horizons and context lengths of up to 4096. The repository includes several variants, Time-MoE (base), Time-MoE (large), and Time-MoE (ultra), with parameter counts ranging from 50 million to 2.4 billion, and the training corpus, a dataset named Time-300B, is expected to be released soon.

Getting started requires Python 3.10 and specific library versions, including transformers. The README provides instructions for making forecasts, with code snippets for both normalized and non-normalized input sequences (see the sketch below), plus evaluation scripts and benchmark datasets such as ETTh1 for assessing model performance. The project asks users to cite the associated research paper if the models prove useful, links to related work on large language models and time series analysis, is licensed under the Apache-2.0 License, and acknowledges the GitHub repositories that influenced its development.
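A sketch of the normalized-input forecasting flow the README describes. The `Maple728/TimeMoE-50M` checkpoint name and the `trust_remote_code` loading path are assumptions about how the models are distributed, so treat them as illustrative.

```python
# Sketch of Time-MoE auto-regressive forecasting with normalized inputs.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Maple728/TimeMoE-50M",  # assumed Hugging Face checkpoint name
    trust_remote_code=True,  # Time-MoE ships custom modeling code
)

seqs = torch.randn(2, 12)  # two series with a context of 12 points

# Normalize each series, forecast auto-regressively, then de-normalize.
mean = seqs.mean(dim=-1, keepdim=True)
std = seqs.std(dim=-1, keepdim=True)
normed = (seqs - mean) / std

prediction_length = 6
output = model.generate(normed, max_new_tokens=prediction_length)
forecast = output[:, -prediction_length:] * std + mean
print(forecast.shape)  # torch.Size([2, 6])
```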